1. Introduction and overview

Our main goal in this paper is to study global rates of convergence of the Maximum Likelihood Estimator (MLE) in one simple model for multivariate interval-censored data. In Section 3 we will show that under some reasonable conditions the MLE converges in a Hellinger metric to the true distribution function on $\mathbb{R}^d$ at a rate no worse than $n^{-1/3}(\log n)^{\gamma_d}$ with $\gamma_d = (5d-4)/6$ for all $d \ge 2$. Thus the rate of convergence is only worse than the known rate of $n^{-1/3}$ for the case $d = 1$ by a factor involving a power of $\log n$ growing linearly with the dimension. These new rate results rely heavily on recent bracketing entropy bounds for $d$-dimensional distribution functions obtained by Gao [2012].

We begin in Section 2 with a review of interval censoring problems and known results in the case d = 1. 
We introduce the multivariate interval censoring model of interest here in Section 3, and obtain a rate of 
convergence for this model for $d \ge 2$ in Theorem 3.1. Most of the proofs are given in Section 4, with the
exception being a key corollary of Gao [2012], the statement and proof of which are given in the Appendix 
(Section 6). Finally, in Section 5 we introduce several related models and further problems. 



2. Interval Censoring (or Current Status Data) on $\mathbb{R}$

Let $Y \sim F_0$ on $\mathbb{R}^+$, and let $T \sim G_0$ on $\mathbb{R}^+$ be independent of $Y$. Suppose that we observe $X_1, \ldots, X_n$ i.i.d. as $X = (\Delta, T)$ where $\Delta = 1_{[Y \le T]}$. Here $Y$ is often the time until some event of interest and $T$ is an observation time. The goal is to estimate $F_0$ nonparametrically based on observation of the $X_i$'s.

To calculate the likelihood, we first calculate the distribution of $X$ for a general distribution function $F$: note that the conditional distribution of $\Delta$ given $T$ is Bernoulli:

\[ (\Delta \mid T) \sim \mathrm{Bernoulli}(p(T)) \]

where $p(T) = F(T)$. If $G_0$ has density $g_0$ with respect to some measure $\mu$ on $\mathbb{R}^+$, then $X = (\Delta, T)$ has density

\[ p_F(\delta, t) = F(t)^{\delta} (1 - F(t))^{1-\delta} g_0(t), \qquad \delta \in \{0,1\}, \ t \in \mathbb{R}^+, \]

with respect to the dominating measure (counting measure on $\{0, 1\}$) $\times\, \mu$.

The nonparametric Maximum Likelihood Estimator (MLE) $\hat F_n$ of $F_0$ in this interval censoring model was first obtained by Ayer et al. [1955]. It is simply described as follows: let $T_{(1)} \le \cdots \le T_{(n)}$ denote the order statistics corresponding to $T_1, \ldots, T_n$ and let $\Delta_{(1)}, \ldots, \Delta_{(n)}$ denote the corresponding $\Delta$'s. Then the part of the log-likelihood of $X_1, \ldots, X_n$ depending on $F$ is given by

\begin{align*}
l_n(F) &= \sum_{i=1}^n \left\{ \Delta_{(i)} \log F(T_{(i)}) + (1 - \Delta_{(i)}) \log(1 - F(T_{(i)})) \right\} \\
&\equiv \sum_{i=1}^n \left\{ \Delta_{(i)} \log F_i + (1 - \Delta_{(i)}) \log(1 - F_i) \right\} \tag{2.1}
\end{align*}

where

\[ 0 \le F_1 \le \cdots \le F_n \le 1. \tag{2.2} \]

It turns out that the maximizer $\hat F_n$ of (2.1) subject to (2.2) can be described as follows: let $H^*$ be the (greatest) convex minorant of the points $\{(i, \sum_{j \le i} \Delta_{(j)}) : i \in \{1, \ldots, n\}\}$:

\[ H^*(t) = \sup\Big\{ H(t) : \ H(i) \le \sum_{j \le i} \Delta_{(j)} \ \text{for each } 0 \le i \le n, \ H(0) = 0, \ \text{and } H \text{ is convex} \Big\}. \]



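Let $\hat F_i$ denote the left-derivative of $H^*$ at $i$ (the abscissa of the $i$-th point, corresponding to $T_{(i)}$). Then $(\hat F_1, \ldots, \hat F_n)$ is the unique vector maximizing (2.1) subject to (2.2), and we therefore take the MLE $\hat F_n$ of $F$ to be

\[ \hat F_n(t) = \sum_{i=0}^n \hat F_i \, 1_{[T_{(i)}, T_{(i+1)})}(t) \]

with the conventions $\hat F_0 = 0$, $T_{(0)} = 0$ and $T_{(n+1)} = \infty$. See Ayer et al. [1955] or Groeneboom and Wellner [1992], pages 38-43, for details.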
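
The convex-minorant description can be turned into a few lines of code. The following is a minimal sketch, not the authors' implementation: it uses the standard fact that the left-derivatives of the greatest convex minorant of a cumulative-sum diagram coincide with the isotonic (non-decreasing) least-squares fit to the ordered indicators, and computes that fit with the pool-adjacent-violators algorithm. The function name `current_status_mle` is hypothetical, and the sketch assumes there are no ties among the observation times.

```python
from typing import List, Sequence, Tuple


def current_status_mle(
    times: Sequence[float], deltas: Sequence[int]
) -> Tuple[List[float], List[float]]:
    """Return the ordered observation times and the fitted values F_1 <= ... <= F_n.

    Assumption (not from the paper): no ties among the observation times.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    t_sorted = [times[i] for i in order]
    d_sorted = [deltas[i] for i in order]

    # Each block stores [sum of indicators, number of observations].
    blocks: List[List[float]] = []
    for d in d_sorted:
        blocks.append([float(d), 1.0])
        # Pool adjacent blocks while their means violate monotonicity.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, m = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += m

    fitted: List[float] = []
    for s, m in blocks:
        fitted.extend([s / m] * int(m))
    return t_sorted, fitted


# Toy data: ordered indicators are 0, 1, 1, 0, 1.
t, f = current_status_mle([0.5, 0.1, 0.9, 0.3, 0.7], [1, 0, 1, 1, 0])
```

In the toy data the violating run 1, 1, 0 is pooled to its average, so the fitted vector is constant on the three middle observation times.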

Groeneboom [1987] initiated the study of $\hat F_n$ and proved the following limiting distribution result at a fixed point $t_0$.

Theorem 2.1. (Groeneboom, 1987). Consider the current status model on $\mathbb{R}^+$. Suppose that $0 < F_0(t_0), G_0(t_0) < 1$ and suppose that $F_0$ and $G_0$ are differentiable at $t_0$ with strictly positive derivatives $f_0(t_0)$ and $g_0(t_0)$ respectively. Then

\[ n^{1/3}(\hat F_n(t_0) - F_0(t_0)) \to_d c(F_0, G_0) Z \]

where

\[ c(F_0, G_0) = 2 \left( \frac{F_0(t_0)(1 - F_0(t_0)) f_0(t_0)}{2 g_0(t_0)} \right)^{1/3} \]

and

\[ Z = \operatorname{argmin}\{ W(t) + t^2 \} \]

where $W$ is a standard two-sided Brownian motion starting from 0.

The distribution of $Z$ has been studied in detail by Groeneboom [1989] and computed by Groeneboom and Wellner [2001]. Balabdaoui and Wellner [2012] show that the density $f_Z$ of $Z$ is log-concave.

van de Geer [1993] (see also van de Geer [2000]) obtained the following global rate result for $p_{\hat F_n}$. Recall that the Hellinger distance $h(p, q)$ between two densities with respect to a dominating measure $\mu$ is given by

\[ h^2(p, q) = \frac{1}{2} \int \{ \sqrt{p} - \sqrt{q} \}^2 \, d\mu. \]

Proposition 2.2. (van de Geer, 1993) $h(p_{\hat F_n}, p_{F_0}) = O_p(n^{-1/3})$.

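Now for any distribution functions $F$ and $F_0$ the (squared) Hellinger distance $h^2(p_F, p_{F_0})$ for the current status model is given by

\begin{align*}
h^2(p_F, p_{F_0}) &= \frac{1}{2} \left\{ \int (\sqrt{F} - \sqrt{F_0})^2 \, dG_0 + \int (\sqrt{1-F} - \sqrt{1-F_0})^2 \, dG_0 \right\} \\
&= \frac{1}{2} \int \frac{[(\sqrt{F} - \sqrt{F_0})(\sqrt{F} + \sqrt{F_0})]^2}{(\sqrt{F} + \sqrt{F_0})^2} \, dG_0 + \frac{1}{2} \int \frac{[(\sqrt{1-F} - \sqrt{1-F_0})(\sqrt{1-F} + \sqrt{1-F_0})]^2}{(\sqrt{1-F} + \sqrt{1-F_0})^2} \, dG_0 \\
&\ge \frac{1}{8} \int (F - F_0)^2 \, dG_0 + \frac{1}{8} \int ((1-F) - (1-F_0))^2 \, dG_0 \\
&= \frac{1}{4} \int (F - F_0)^2 \, dG_0, \tag{2.3}
\end{align*}

where the inequality uses the fact that both denominators are at most 4, and hence Proposition 2.2 yields

\[ \int (\hat F_n(z) - F_0(z))^2 \, dG_0(z) = O_p(n^{-2/3}), \tag{2.4} \]

or $\| \hat F_n - F_0 \|_{L_2(G_0)} = O_p(n^{-1/3})$.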
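
As a quick numerical sanity check of the lower bound of the squared Hellinger distance by one quarter of the squared $L_2(G_0)$ distance, the sketch below evaluates both sides on a grid. The choices $G_0$ = Uniform(0,1), $F_0(t) = t$ and $F(t) = t^2$ are illustrative assumptions only, and the function names are hypothetical.

```python
import math
from typing import Callable, Sequence


def hellinger_sq_current_status(
    F: Callable[[float], float], F0: Callable[[float], float], grid: Sequence[float]
) -> float:
    """Midpoint-rule value of h^2(p_F, p_F0) when G_0 is uniform on the grid's range."""
    total = 0.0
    for t in grid:
        a, b = F(t), F0(t)
        total += (math.sqrt(a) - math.sqrt(b)) ** 2
        total += (math.sqrt(1.0 - a) - math.sqrt(1.0 - b)) ** 2
    return 0.5 * total / len(grid)


def l2_sq(
    F: Callable[[float], float], F0: Callable[[float], float], grid: Sequence[float]
) -> float:
    """Midpoint-rule value of the integral of (F - F0)^2 dG_0 for uniform G_0."""
    return sum((F(t) - F0(t)) ** 2 for t in grid) / len(grid)


grid = [(i + 0.5) / 1000 for i in range(1000)]  # midpoints of [0, 1]
h2 = hellinger_sq_current_status(lambda t: t * t, lambda t: t, grid)
l2 = l2_sq(lambda t: t * t, lambda t: t, grid)
lower_bound = 0.25 * l2  # right side of the inequality
```

Because the inequality holds pointwise in $t$, it also holds for the discretized sums, whatever grid is used.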

For generalizations of these and other asymptotic results for the current status model to more complicated interval censoring schemes for real-valued random variables $Y$, see e.g. Groeneboom and Wellner [1992], van de Geer [1993], Groeneboom [1996], van de Geer [2000], Schick and Yu [2000], and Groeneboom, Maathuis and Wellner [2008a,b].

Our main focus in this paper, however, concerns one simple generalization of the interval censoring model for $\mathbb{R}$ introduced above to interval censoring in $\mathbb{R}^d$. We now turn to this generalization.



3. Multivariate interval censoring: multivariate current status data 

Let $Y = (Y_1, \ldots, Y_d) \sim F_0$ on $\mathbb{R}^{+d} = [0, \infty)^d$, and let $T = (T_1, \ldots, T_d) \sim G_0$ on $\mathbb{R}^{+d}$ be independent of $Y$. We assume that $G_0$ has density $g_0$ with respect to some dominating measure $\mu$ on $\mathbb{R}^{+d}$. Suppose we observe $X_1, \ldots, X_n$ i.i.d. as $X = (\Delta, T)$ where $\Delta = (\Delta_1, \ldots, \Delta_d)$ is given by $\Delta_j = 1_{[Y_j \le T_j]}$, $j = 1, \ldots, d$. Equivalently, with a slight abuse of notation, $X = (\Gamma, T)$ where $\Gamma = (\Gamma_1, \ldots, \Gamma_{2^d})$ is a vector of length $2^d$ consisting of 0's and 1's and with at most one 1 which indicates into which of the $2^d$ orthants of $\mathbb{R}^{+d}$ determined by $T$ the random vector $Y$ belongs. More explicitly, define $K = 1 + \sum_{j=1}^d (1 - \Delta_j) 2^{j-1}$. Then set $\Gamma_k = 1\{k = K\}$ for $k = 1, \ldots, 2^d$, so that $\Gamma_K = 1$ and $\Gamma_l = 0$ for $l \in \{1, \ldots, 2^d\} \setminus \{K\}$. Much as for univariate current status data, $Y$ represents a vector of times to events, $T$ is a vector of observation times, and the goal is nonparametric estimation of the joint distribution function $F_0$ of $Y$ based on observation of the $X_i$'s. See Dunson and Dinse [2002], Jewell [2007], Wang [2009], and Lin and Wang [2011] for examples of settings in which data of this type arises.
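
The encoding of the indicator vector $\Delta$ into the orthant index $K$ and the one-hot vector $\Gamma$ can be written out directly. This is a small illustrative sketch; the function names `orthant_index` and `gamma_vector` are hypothetical and not part of the paper.

```python
from typing import List, Sequence


def orthant_index(delta: Sequence[int]) -> int:
    """K = 1 + sum_{j=1}^d (1 - delta_j) * 2**(j - 1) for delta in {0, 1}^d."""
    # enumerate() starts at 0, so position j here plays the role of (j - 1) above.
    return 1 + sum((1 - dj) * 2 ** j for j, dj in enumerate(delta))


def gamma_vector(delta: Sequence[int]) -> List[int]:
    """One-hot vector of length 2**d with a single 1 in position K (1-based)."""
    d = len(delta)
    k = orthant_index(delta)
    return [1 if i == k else 0 for i in range(1, 2 ** d + 1)]


# d = 2: delta = (1, 0) means Y_1 <= T_1 and Y_2 > T_2, which gives K = 3.
example = gamma_vector([1, 0])
```

For $d = 2$ the four possible values of $\Delta$, namely $(1,1)$, $(0,1)$, $(1,0)$, $(0,0)$, map to $K = 1, 2, 3, 4$ respectively.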

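To calculate the likelihood, we first calculate the distribution of $X$ for a general distribution function $F$: note that the conditional distribution of $\Gamma$ given $T$ is Multinomial:

\[ (\Gamma \mid T) \sim \mathrm{Mult}_{2^d}(1, p(T; F)) \]

where $p(T; F) = (p_1(T; F), \ldots, p_{2^d}(T; F))$ and the probabilities $p_j(t; F)$, $j = 1, \ldots, 2^d$, $t \in \mathbb{R}^{+d}$, are determined by the $F$ measures of the corresponding sets. Then our model $\mathcal{P}$ for multivariate current status data is the collection of all densities with respect to the dominating measure (counting measure on $\{0, 1\}^{2^d}$) $\times\, \mu$ given by

\[ p_F(\gamma, t) = \prod_{j=1}^{2^d} p_j(t; F)^{\gamma_j} \, g_0(t) \]

for some distribution function $F$ on $\mathbb{R}^{+d}$ where $t \in \mathbb{R}^{+d}$ and $\gamma_j \in \{0, 1\}$ with $\sum_{j=1}^{2^d} \gamma_j = 1$. Now the part of the log-likelihood that depends on $F$ is given by

\[ l_n(F) = \sum_{i=1}^n \sum_{j=1}^{2^d} \Gamma_{i,j} \log p_j(T_i; F) \]

where $\Gamma_{i,j}$ is the $j$-th component of $\Gamma$ for the $i$-th observation and $T_i$ is the $i$-th observation time vector, and again the MLE $\hat F_n$ of the true distribution function $F_0$ is given by

\[ \hat F_n = \operatorname{argmax}\{ l_n(F) : F \text{ is a distribution function on } \mathbb{R}^{+d} \}. \tag{3.1} \]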

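For example, when $d = 2$, we can write $\Gamma_1 = \Delta_1 \Delta_2$, $\Gamma_2 = (1 - \Delta_1)\Delta_2$, $\Gamma_3 = \Delta_1(1 - \Delta_2)$, and $\Gamma_4 = (1 - \Delta_1)(1 - \Delta_2)$, and then

\begin{align*}
p_1(T; F) &= F(T_1, T_2), \\
p_2(T; F) &= F(\infty, T_2) - F(T_1, T_2), \\
p_3(T; F) &= F(T_1, \infty) - F(T_1, T_2), \\
p_4(T; F) &= 1 - F(T_1, \infty) - F(\infty, T_2) + F(T_1, T_2).
\end{align*}

Thus

\[ P_F(\Gamma = \gamma \mid T) = \prod_{j=1}^4 p_j(T; F)^{\gamma_j}, \quad \text{for } \gamma = (\gamma_1, \gamma_2, \gamma_3, \gamma_4), \ \gamma_j \in \{0, 1\}, \ \sum_{j=1}^4 \gamma_j = 1. \]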

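Note that

\[ p_j(t; F) = \int 1_{C_j(t)}(y) \, dF(y), \qquad j = 1, \ldots, 4 \tag{3.2} \]

where, in agreement with the definitions of $\Gamma_1, \ldots, \Gamma_4$ and $p_1, \ldots, p_4$,

\begin{align*}
C_1(t) &= [0, t_1] \times [0, t_2], \\
C_2(t) &= (t_1, \infty) \times [0, t_2], \\
C_3(t) &= [0, t_1] \times (t_2, \infty), \\
C_4(t) &= (t_1, \infty) \times (t_2, \infty).
\end{align*}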
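
The four cell probabilities for $d = 2$ can be checked numerically from the formulas for $p_1, \ldots, p_4$ in terms of $F$. The sketch below assumes, purely for illustration, a distribution function with independent standard exponential margins; `cell_probabilities` and `F_indep_exp` are hypothetical names.

```python
import math
from typing import Callable, List


def cell_probabilities(F: Callable[[float, float], float], t1: float, t2: float) -> List[float]:
    """Return [p_1, p_2, p_3, p_4] at t = (t1, t2) for a bivariate distribution function F."""
    inf = math.inf
    p1 = F(t1, t2)                                        # Y_1 <= t1, Y_2 <= t2
    p2 = F(inf, t2) - F(t1, t2)                           # Y_1 >  t1, Y_2 <= t2
    p3 = F(t1, inf) - F(t1, t2)                           # Y_1 <= t1, Y_2 >  t2
    p4 = 1.0 - F(t1, inf) - F(inf, t2) + F(t1, t2)        # Y_1 >  t1, Y_2 >  t2
    return [p1, p2, p3, p4]


def F_indep_exp(y1: float, y2: float) -> float:
    """Illustrative assumption: independent Exp(1) coordinates."""
    return (1.0 - math.exp(-y1)) * (1.0 - math.exp(-y2))


p = cell_probabilities(F_indep_exp, 1.0, 2.0)
```

Under the independence assumption each $p_j$ factorizes, for instance $p_2 = e^{-1}(1 - e^{-2})$ at $t = (1, 2)$, which gives a direct check on the bookkeeping.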

Characterizations and computation of the MLE (3.1), mostly for the case $d = 2$, have been treated in Song [2001], Gentleman and Vandal [2002], and Maathuis [2005, 2006]. Consistency of the MLE for more general interval censoring models has been established by Yu, Yu and Wong [2006]. For an interesting application see Betensky and Finkelstein [1999]. This example and other examples of multivariate interval censored data are treated in Sun [2006] and Deng and Fang [2009]. For a comparison of the MLE with alternative estimators in the case $d = 2$, see Groeneboom [2012a].

An analogue of Groeneboom's Theorem 2.1 has not been established in the multivariate case. Song [2001] established an asymptotic minimax lower bound for pointwise convergence when $d = 2$: if $F_0$ and $G_0$ have positive continuous densities at $t_0$, then no estimator has a local minimax rate for estimation of $F_0(t_0)$ faster than $n^{-1/3}$. By making use of additional smoothness hypotheses, Groeneboom [2012a] has constructed estimators which achieve the pointwise $n^{-1/3}$ rate, but it is not yet known if the MLE achieves this.

Our main goal here is to prove the following theorem concerning the global rate of convergence of the MLE $\hat F_n$.

Theorem 3.1. Consider the multivariate current status model. Suppose that $F_0$ has $\mathrm{supp}(F_0) \subset [0, M]^d$ and that $F_0$ has density $f_0$ which satisfies

\[ c_1^{-1} \le f_0(y) \le c_1 \quad \text{for all } y \in [0, M]^d \tag{3.3} \]

where $0 < c_1 < \infty$. Suppose that $G_0$ has density $g_0$ which satisfies

\[ c_2^{-1} \le g_0(y) \le c_2 \quad \text{for all } y \in [0, M]^d. \tag{3.4} \]

Then the MLE $\hat p_n = p_{\hat F_n}$ of $p_0 = p_{F_0}$ satisfies

\[ h(\hat p_n, p_0) = O_p\left( \frac{(\log n)^{\gamma}}{n^{1/3}} \right) \]

for $\gamma = \gamma_d = (5d - 4)/6$.
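
To get a feel for how the logarithmic factor grows with the dimension, the exponent $\gamma_d = (5d-4)/6$ and the resulting bound $(\log n)^{\gamma_d} / n^{1/3}$ can be tabulated. This is only an arithmetic illustration with the implied constant taken to be 1 (an assumption); `gamma_d` and `rate_bound` are hypothetical helper names.

```python
import math
from fractions import Fraction


def gamma_d(d: int) -> Fraction:
    """Exponent of log n in the Hellinger rate, gamma_d = (5d - 4)/6, for d >= 2."""
    if d < 2:
        raise ValueError("the theorem is stated for d >= 2")
    return Fraction(5 * d - 4, 6)


def rate_bound(n: float, d: int) -> float:
    """(log n)^gamma_d / n^(1/3), ignoring the constant hidden in O_p."""
    return math.log(n) ** float(gamma_d(d)) / n ** (1.0 / 3.0)


exponents = {d: gamma_d(d) for d in range(2, 6)}
bound_d2 = rate_bound(1e6, 2)
bound_d3 = rate_bound(1e6, 3)
```

Each additional dimension increases the power of $\log n$ by $5/6$, while the polynomial part $n^{-1/3}$ is unchanged.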

Since the inequality (2.3) continues to hold in $\mathbb{R}^d$ for $d \ge 2$ (with 1/4 replaced by 1/8 on the right side), we obtain the following corollary:

Corollary 3.2. Under the conditions of Theorem 3.1 it follows that

\[ \int (\hat F_n(z) - F_0(z))^2 \, dG_0(z) = O_p\left( n^{-2/3} (\log n)^{\beta} \right) \]

for $\beta = \beta_d = 2\gamma_d = (5d - 4)/3$.



4. Proofs

Here we give the proof of Theorem 3.1. The main tool is a method developed by van de Geer [2000]. We will use the following lemma in combination with Theorem 7.6 of van de Geer [2000] or Theorem 3.4.1 of van der Vaart and Wellner [1996] (Section 3.4.2, pages 330-331). Without loss of generality we can take $M = 1$ where $M$ is the upper bound of the support of $F_0$ (see Theorem 3.1).

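Let $\mathcal{P}$ be a collection of probability densities $p$ on a sample space $\mathcal{X}$ with respect to a dominating measure $\mu$. Define

\begin{align*}
\mathcal{G}^{(conv)} &\equiv \left\{ \frac{2p}{p + p_0} : \ p \in \mathcal{P} \right\}, \tag{4.1} \\
\sigma(\delta) &\equiv \sup\left\{ \sigma \ge 0 : \ \int_{[p_0 \le \sigma]} p_0 \, d\mu \le \delta^2 \right\} \quad \text{for } \delta > 0, \tag{4.2} \\
\mathcal{G}^{(conv)}_{\sigma} &\equiv \left\{ \frac{2p}{p + p_0} 1_{[p_0 > \sigma]} : \ p \in \mathcal{P} \right\}, \quad \text{for } \sigma > 0. \tag{4.3}
\end{align*}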

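The following general result relating the bracketing entropies $\log N_{[\,]}(\cdot, \mathcal{G}^{(conv)}, L_2(P_0))$, $\log N_{[\,]}(\cdot, \mathcal{G}^{(conv)}_{\sigma}, L_2(P_0))$, $\log N_{[\,]}(\cdot, \mathcal{P}, L_2(Q_\sigma))$, and $\log N_{[\,]}(\cdot, \mathcal{P}, L_2(\tilde Q_\sigma))$ is due to van de Geer [2000].

Lemma 4.1. (van de Geer, 2000) For every $\epsilon > 0$

\begin{align*}
\log N_{[\,]}(3\epsilon, \mathcal{G}^{(conv)}, L_2(P_0)) &\le \log N_{[\,]}(\epsilon, \mathcal{G}^{(conv)}_{\sigma(\epsilon)}, L_2(P_0)) \tag{4.4} \\
&\le \log N_{[\,]}(\epsilon/2, \mathcal{P}, L_2(Q_{\sigma(\epsilon)})) \tag{4.5} \\
&= \log N_{[\,]}\left( \frac{\epsilon}{2\, Q_{\sigma(\epsilon)}(\mathcal{X})^{1/2}}, \mathcal{P}, L_2(\tilde Q_{\sigma(\epsilon)}) \right) \tag{4.6}
\end{align*}

where $dQ_\sigma = p_0^{-1} 1_{[p_0 > \sigma]} \, d\mu$, and $\tilde Q_\sigma = Q_\sigma / Q_\sigma(\mathcal{X})$.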

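Proof. We first show that (4.4) holds. Suppose that $\{[g_{L,j}, g_{U,j}], \ j = 1, \ldots, m\}$ are $\epsilon$-brackets with respect to $L_2(P_0)$ for $\mathcal{G}^{(conv)}_{\sigma(\epsilon)}$ with

\[ \mathcal{G}^{(conv)}_{\sigma(\epsilon)} \subset \bigcup_{j=1}^m [g_{L,j}, g_{U,j}] \quad \text{and} \quad m = N_{[\,]}(\epsilon, \mathcal{G}^{(conv)}_{\sigma(\epsilon)}, L_2(P_0)). \]

Then for $g \in \mathcal{G}^{(conv)}$, let $g_\sigma = g 1_{[p_0 > \sigma]}$ be the corresponding element of $\mathcal{G}^{(conv)}_{\sigma}$. Suppose that $g_\sigma \in [g_{L,j}, g_{U,j}]$ for some $j \in \{1, \ldots, m\}$. Then

\[ g_{L,j} \le g \le g_{U,j} + 2 \cdot 1_{[p_0 \le \sigma]} \equiv \tilde g_{U,j} \]

where, by the triangle inequality, $0 \le g \le 2$ for all $g \in \mathcal{G}^{(conv)}$, and the definition of $\sigma(\epsilon)$, it follows that

\[ \| \tilde g_{U,j} - g_{L,j} \|_{P_0,2} \le \| g_{U,j} - g_{L,j} \|_{P_0,2} + 2\epsilon \le 3\epsilon. \]

Thus $\{[g_{L,j}, \tilde g_{U,j}] : \ j \in \{1, \ldots, m\}\}$ is a collection of $3\epsilon$-brackets for $\mathcal{G}^{(conv)}$ with respect to $L_2(P_0)$ and hence (4.4) holds.

Now we show that (4.5) holds. Suppose that $\{[p_{L,j}, p_{U,j}] : \ j = 1, \ldots, m\}$ is a set of $\epsilon/2$-brackets with respect to $L_2(Q_\sigma)$ for $\mathcal{P}$ with

\[ \mathcal{P} \subset \bigcup_{j=1}^m [p_{L,j}, p_{U,j}] \quad \text{and} \quad m = N_{[\,]}(\epsilon/2, \mathcal{P}, L_2(Q_{\sigma(\epsilon)})). \]

Suppose $p \in [p_{L,j}, p_{U,j}]$ for some $j$. Then, since $x \mapsto 2x/(x + p_0)$ is nondecreasing,

\[ g_{L,j} \equiv \frac{2 p_{L,j}}{p_{L,j} + p_0} 1_{[p_0 > \sigma]} \le \frac{2p}{p + p_0} 1_{[p_0 > \sigma]} \le \frac{2 p_{U,j}}{p_{U,j} + p_0} 1_{[p_0 > \sigma]} \equiv g_{U,j}, \]

where

\begin{align*}
| g_{U,j} - g_{L,j} | &= \left| \frac{2 p_{U,j}}{p_{U,j} + p_0} - \frac{2 p_{L,j}}{p_{L,j} + p_0} \right| 1_{[p_0 > \sigma]} \\
&\le \frac{2 (p_{U,j} - p_{L,j})}{p_{L,j} + p_0} 1_{[p_0 > \sigma]} \\
&\le 2 | p_{U,j} - p_{L,j} | \frac{1}{p_0} 1_{[p_0 > \sigma]}.
\end{align*}

Thus

\[ \| g_{U,j} - g_{L,j} \|_{P_0,2} \le 2 \| p_{U,j} - p_{L,j} \|_{Q_\sigma,2} \le \epsilon, \]

and hence $\{[g_{L,j}, g_{U,j}] : \ j = 1, \ldots, m\}$ is a set of $\epsilon$-brackets with respect to $L_2(P_0)$ for $\mathcal{G}^{(conv)}_{\sigma}$. This shows that (4.5) holds.

It remains only to show that (4.6) holds. But this is easy since $\| g \|^2_{Q_\sigma,2} = \| g \|^2_{\tilde Q_\sigma,2} \, Q_\sigma(\mathcal{X})$.

This lemma is based on van de Geer [2000], pages 101 and 103. Note that our constants differ slightly from those of van de Geer. □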



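Lemma 4.2. Suppose that $F_0$ has density $f_0$ which satisfies, for some $0 < c_1 < \infty$,

\[ \frac{1}{c_1} \le f_0(y) \le c_1 \quad \text{for all } y \in [0, 1]^d. \tag{4.7} \]

Then $p_0$ (which we can identify with the vector $p_0(\cdot, F_0)$) satisfies

\begin{align*}
c_1^{-1} \prod_{j=1}^d t_j \le p_{0,1}(t; F_0) &\le c_1 \prod_{j=1}^d t_j \quad \text{for all } t \in [0, 1]^d, \\
&\ \ \vdots \\
c_1^{-1} \prod_{j=1}^d (1 - t_j) \le p_{0,2^d}(t; F_0) &\le c_1 \prod_{j=1}^d (1 - t_j) \quad \text{for all } t \in [0, 1]^d.
\end{align*}

Proof. This follows immediately from the general $d$ version of (3.2) and the assumption on $f_0$. □

These inequalities can also be written in the following compact form: for $k = 1 + \sum_{j=1}^d (1 - \delta_j) 2^{j-1}$ with $\delta_j \in \{0, 1\}$,

\[ c_1^{-1} \prod_{j=1}^d t_j^{\delta_j} (1 - t_j)^{1 - \delta_j} \le p_{0,k}(t; F_0) \le c_1 \prod_{j=1}^d t_j^{\delta_j} (1 - t_j)^{1 - \delta_j} \quad \text{for all } t \in [0, 1]^d. \]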

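Lemma 4.3. Suppose that the assumption of Lemma 4.2 holds. Suppose, moreover, that $G_0$ has density $g_0$ which satisfies

\[ \frac{1}{c_2} \le g_0(y) \le c_2 \quad \text{for all } y \in [0, 1]^d. \tag{4.8} \]

Then

\[ \int_{[p_0 \le \sigma]} p_0 \, d\mu \le 2^d (c_1 c_2)^2 \sigma. \]

Furthermore, with $\sigma(\delta) \equiv \delta^2 / (2^d (c_1 c_2)^2)$ we have

\[ \int_{[p_0 \le \sigma(\delta)]} p_0 \, d\mu \le \delta^2. \]

Proof. The first inequality follows easily from Lemma 4.2: note that, bounding each of the $2^d$ terms in the same way as the first,

\begin{align*}
\int_{[p_0 \le \sigma]} p_0 \, d\mu &= \sum_{k=1}^{2^d} \int_{[p_k(t; F_0) g_0(t) \le \sigma]} p_k(t; F_0) g_0(t) \, dt \\
&\le 2^d \int_{[F_0(t) g_0(t) \le \sigma]} F_0(t) g_0(t) \, dt \\
&\le 2^d c_1 c_2 \int_{[\prod_{j=1}^d t_j \le c_1 c_2 \sigma]} \prod_{j=1}^d t_j \, dt \le 2^d (c_1 c_2)^2 \sigma.
\end{align*}

The second inequality follows from the first inequality of the lemma. □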

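Lemma 4.4. If the hypotheses of Lemmas 4.2 and 4.3 hold, then the measure $Q_\sigma$ defined by $dQ_\sigma = (1/p_0) 1\{p_0 > \sigma\} \, d\mu$ has total mass $Q_\sigma(\mathcal{X})$ satisfying

\begin{align*}
Q_\sigma(\mathcal{X}) = \int dQ_\sigma &= \int_{[p_0 > \sigma]} \frac{1}{p_0} \, d\mu \\
&\le 2^d \int_{[\prod_{j=1}^d t_j > \sigma/(c_1 c_2)]} \frac{c_1 c_2}{\prod_{j=1}^d t_j} \, dt \tag{4.9} \\
&= \frac{2^d c_1 c_2}{d!} \left( \log(c_1 c_2 / \sigma) \right)^d. \tag{4.10}
\end{align*}

Proof. This follows from Lemma 4.2, followed by an explicit calculation. In particular, the equality in (4.10) follows from

\begin{align*}
\int_{[\prod_{j=1}^d t_j > b]} \frac{1}{\prod_{j=1}^d t_j} \, dt &= \int_{[\sum_{j=1}^d x_j < \log(1/b)]} dx \quad \text{by the change of variables } t_j = e^{-x_j}, \\
&= \frac{1}{d!} \left( \log(1/b) \right)^d \quad \text{for } 0 < b \le 1,
\end{align*}

where the integrals are over $t \in (0, 1]^d$ and $x \in [0, \infty)^d$ respectively, and the second equality follows by induction: it holds easily for $d = 1$ (and $d = 2$); and then an easy calculation shows that it holds for $d$ if it holds for $d - 1$. □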
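
The closed form $(\log(1/b))^d / d!$ for the integral of $1/\prod_j t_j$ over the region of the unit cube where $\prod_j t_j > b$ can be checked numerically for $d = 2$. The sketch below uses a plain midpoint rule on the unit square; the grid size and the value $b = 0.1$ are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math


def closed_form(b: float, d: int) -> float:
    """(log(1/b))^d / d!, the volume of the simplex {x >= 0, sum x_j < log(1/b)}."""
    return math.log(1.0 / b) ** d / math.factorial(d)


def midpoint_integral_d2(b: float, n: int = 1000) -> float:
    """Midpoint-rule approximation over (0,1]^2 of 1/(t1*t2) restricted to t1*t2 > b."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t1 = (i + 0.5) * h
        for j in range(n):
            prod = t1 * (j + 0.5) * h
            if prod > b:
                total += 1.0 / prod
    return total * h * h


approx = midpoint_integral_d2(0.1)
exact = closed_form(0.1, 2)
```

The integrand is bounded by $1/b$ on the region of integration, so the only discretization error of note comes from grid cells cut by the curve $t_1 t_2 = b$, and it shrinks as the grid is refined.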



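Lemma 4.5. If the hypotheses of Lemmas 4.2 and 4.3 hold, and $d \ge 2$, then

\[ \log N_{[\,]}(\epsilon, \mathcal{G}^{(conv)}, L_2(P_0)) \le K \frac{[\log(1/\epsilon)]^{5d/2 - 2}}{\epsilon} \]

for all $0 < \epsilon <$ some $\epsilon_0$ and some constant $K < \infty$.

Proof. This follows by combining the results of Lemmas 4.3 and 4.4 with Lemma 4.1, and then using Corollary 6.2 of the bracketing entropy bound of Gao [2012] and stated here as Theorem 6.1. Here is the explicit calculation, with $\sigma = \sigma(2\epsilon)$ and with $K$ denoting a constant that may change from line to line:

\begin{align*}
\log N_{[\,]}(6\epsilon, \mathcal{G}^{(conv)}, L_2(P_0))
&\le \log N_{[\,]}(\epsilon, \mathcal{P}, L_2(Q_\sigma)) \\
&= \log N_{[\,]}\left( \frac{\epsilon}{\sqrt{Q_\sigma(\mathcal{X})}}, \mathcal{P}, L_2(\tilde Q_\sigma) \right) \\
&\le \log N_{[\,]}\left( \frac{\epsilon}{\sqrt{(2^d c_1 c_2 / d!) \, [\log((c_1 c_2)^3 2^d / (4\epsilon^2))]^d}}, \mathcal{P}, L_2(\tilde Q_\sigma) \right) \quad \text{by Lemmas 4.3 and 4.4} \\
&\le \log N_{[\,]}\left( \frac{\epsilon}{V [\log(1/\epsilon)]^{d/2}}, \mathcal{P}, L_2(\tilde Q_\sigma) \right) \quad \text{for } V = V_d(c_1, c_2) \\
&\le K \frac{[\log(1/\epsilon)]^{d/2}}{\epsilon} \left[ \log\left( \frac{(\log(1/\epsilon))^{d/2}}{\epsilon} \right) \right]^{2(d-1)} \quad \text{by Corollary 6.2(b)} \\
&\le K \frac{1}{\epsilon} [\log(1/\epsilon)]^{5d/2 - 2}
\end{align*}

for $\epsilon$ sufficiently small. □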



Proof. (Theorem 3.1) This follows from Lemma 4.5 and Theorem 7.6 of van de Geer [2000] or Theorem 3.4.1 of van der Vaart and Wellner [1996] together with the arguments given in Section 3.4.2. By Lemma 4.5 the bracketing entropy integrals satisfy

\[ J_{[\,]}(\delta, \mathcal{G}^{(conv)}, L_2(P_0)) = \int_0^\delta \sqrt{1 + \log N_{[\,]}(\epsilon, \mathcal{G}^{(conv)}, L_2(P_0))} \, d\epsilon \lesssim \int_0^\delta \epsilon^{-1/2} \{\log(1/\epsilon)\}^{(5d/2 - 2)/2} \, d\epsilon \]

where the bound on the right side behaves asymptotically as a constant times $2\delta^{1/2} (\log(1/\delta))^{(5d/2 - 2)/2}$, and $(5d/2 - 2)/2 = 3\gamma_d/2$. Hence (using the notation of Theorem 3.4.1 of van der Vaart and Wellner [1996]), we can take $\phi_n(\delta) = K 2 \delta^{1/2} (\log(1/\delta))^{3\gamma_d/2}$. Thus with $r_n = n^{1/3} / (\log n)^{\beta}$ with $\beta = \gamma_d$ we find that $r_n^2 \phi_n(1/r_n) \sim K' \sqrt{n}$ for a constant $K'$, and hence the claimed order of convergence holds. □
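
The final step of the proof, that $r_n^2 \phi_n(1/r_n)$ is of order $\sqrt{n}$ when $r_n = n^{1/3}/(\log n)^{\gamma_d}$ and $\phi_n(\delta) = 2K\delta^{1/2}(\log(1/\delta))^{3\gamma_d/2}$, can be illustrated numerically. The sketch below works on the log scale to avoid overflow; taking $K = 1$ is an arbitrary assumption and `normalized_modulus` is a hypothetical name.

```python
import math


def gamma_d(d: int) -> float:
    """gamma_d = (5d - 4)/6."""
    return (5 * d - 4) / 6.0


def normalized_modulus(log_n: float, d: int, K: float = 1.0) -> float:
    """Return r_n^2 * phi_n(1/r_n) / sqrt(n), given log(n), for r_n = n^(1/3)/(log n)^gamma_d."""
    g = gamma_d(d)
    log_r = log_n / 3.0 - g * math.log(log_n)  # log r_n
    if log_r <= 0.0:
        raise ValueError("n is too small: need r_n > 1")
    # r_n^2 * phi_n(1/r_n) = 2K * r_n^(3/2) * (log r_n)^(3g/2)
    log_value = math.log(2.0 * K) + 1.5 * log_r + 1.5 * g * math.log(log_r)
    return math.exp(log_value - 0.5 * log_n)


d = 2
ratio_small = normalized_modulus(math.log(1e6), d)
ratio_large = normalized_modulus(100.0 * math.log(10.0), d)  # n = 10^100
limit = 2.0 * 3.0 ** (-1.5 * gamma_d(d))
```

Algebraically the ratio equals $2K(\log r_n / \log n)^{3\gamma_d/2}$, which increases (slowly, because of the $\log\log n$ correction) to the constant $2K \cdot 3^{-3\gamma_d/2}$.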



5. Some related models and further problems 

There are several related models in which we expect to see the same basic phenomenon as established here, namely a global convergence rate of the form $n^{-1/3}(\log n)^{\gamma}$ in all dimensions $d \ge 2$ with only the power $\gamma$ of the log term depending on $d$. Three such models are:

(a) the "in-out model" for interval censoring in $\mathbb{R}^d$;

(b) the "case 2" multivariate interval censoring models studied by Deng and Fang [2009]; and

(c) the scale mixture of uniforms model for decreasing densities in $\mathbb{R}^{+d}$.

Here we briefly sketch why we expect the same phenomenon to hold in these three cases, even though we do 
not yet know pointwise convergence rates in any of these cases. 

5.1. The "in-out model" for interval censoring in $\mathbb{R}^d$

The "in-out model" for interval censoring in $\mathbb{R}^d$ was explored in the case $d = 2$ by Song [2001]. In this model $Y \sim F$ on $\mathbb{R}^2$, $R$ is a random rectangle in $\mathbb{R}^2$ independent of $Y$ (say $[U, V] = \{x = (x_1, x_2) \in \mathbb{R}^2 : U_1 \le x_1 \le V_1, \ U_2 \le x_2 \le V_2\}$ where $U$ and $V$ are random vectors in $\mathbb{R}^2$ with $U \le V$ coordinatewise). We observe only $(1_R(Y), R)$, and the goal is to estimate the unknown distribution function $F$.

Song [2001] (page 86) produced a local asymptotic minimax lower bound for estimation of $F$ at a fixed $t_0 \in \mathbb{R}^2$. Under the assumption that $F$ has a positive density $f$ at $t_0$, Song [2001] showed that any estimator of $F(t_0)$ can have a local-minimax convergence rate which is at best $n^{-1/3}$. Groeneboom [2012a] has shown that this rate can be achieved by estimators involving smoothing methods. Based on the results for current status data in $\mathbb{R}^d$ obtained in Theorem 3.1 and the entropy results for the class of distribution functions on $\mathbb{R}^d$, we conjecture that the global Hellinger rate of convergence of the MLE $\hat F_n$ will be $n^{-1/3}(\log n)^{\nu}$ for all $d \ge 2$ where $\nu = \nu_d$.

5.2. "Case 2" multivariate interval censoring models in $\mathbb{R}^d$

Recall that "case 2" interval censored data on $\mathbb{R}$ is as follows: suppose that $Y \sim F_0$ on $\mathbb{R}^+$, the pair of observation times $(U, V)$ with $U \le V$ determines a random interval $(U, V]$, and we observe $X = (\Delta, U, V) = (\Delta_1, \Delta_2, \Delta_3, U, V)$ where $\Delta_1 = 1\{Y \le U\}$, $\Delta_2 = 1\{U < Y \le V\}$, and $\Delta_3 = 1\{V < Y\}$. Nonparametric estimation of $F_0$ based on $X_1, \ldots, X_n$ i.i.d. as $X$ has been discussed by a number of authors, including Groeneboom and Wellner [1992], Geskus and Groeneboom [1999], and Groeneboom [1996]. Deng and Fang [2009] studied generalizations of this model to $\mathbb{R}^d$, and obtained rates of convergence of the MLE with respect to the Hellinger metric given by $n^{-(1+d)/(2(1+2d))}(\log n)^{d^2/(2(2d+1))}$ in the case most comparable to the multivariate interval censoring model studied here. While this rate reduces when $d = 1$ to the known rate $n^{-1/3}(\log n)^{1/6}$, it is slower than $n^{-1/3}(\log n)^{\nu}$ for some $\nu$ when $d > 1$ due to the use of entropy bounds involving convex hulls (see Deng and Fang [2009], Proposition A.1, page 66) which are not necessarily sharp. We expect that rates of the form $n^{-1/3}(\log n)^{\nu}$ with $\nu > 0$ are possible in these models as well.
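
The comparison between the polynomial part of the Deng and Fang rate and the conjectured $n^{-1/3}$ is simple arithmetic. The sketch below assumes that the polynomial exponent of their rate reads $(1+d)/(2(1+2d))$; that reading of the rate, and the helper name `deng_fang_poly_exponent`, are assumptions made for this illustration.

```python
from fractions import Fraction


def deng_fang_poly_exponent(d: int) -> Fraction:
    """Exponent a_d in n^(-a_d), assuming a_d = (1 + d) / (2 (1 + 2d))."""
    return Fraction(1 + d, 2 * (1 + 2 * d))


one_third = Fraction(1, 3)
exponents = {d: deng_fang_poly_exponent(d) for d in range(1, 6)}
# A smaller exponent means a slower polynomial rate.
slower_than_cube_root = {d: a < one_third for d, a in exponents.items()}
```

Under this reading the exponent equals $1/3$ only at $d = 1$ and decreases toward $1/4$ as $d$ grows, which is why the resulting rate is slower than $n^{-1/3}$ up to logarithmic factors for every $d > 1$.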

5.3. Scale mixtures of uniform densities on $\mathbb{R}^{+d}$

Pavlides [2008] and Pavlides and Wellner [2012] studied the family of scale mixtures of uniform densities of the following form:

\[ f_G(x) = \int \frac{1}{\prod_{j=1}^d y_j} 1_{(0, y]}(x) \, dG(y) = \int \frac{1}{|y|} 1_{(0, y]}(x) \, dG(y) \tag{5.1} \]

for some distribution function $G$ on $(0, \infty)^d$. (Note that we have used the notation $\prod_{j=1}^d y_j = |y|$ for $y = (y_1, \ldots, y_d) \in \mathbb{R}^{+d}$.) It is not difficult to see that such densities are decreasing in each coordinate and that they also satisfy

\[ (\Delta_d f_G)(u, v] = (-1)^d \int |y|^{-1} 1_{[u, v)}(y) \, dG(y), \quad \text{so that} \quad (-1)^d (\Delta_d f_G)(u, v] \ge 0, \]

for all $u, v \in \mathbb{R}^{+d}$ with $u \le v$; here $\Delta_d$ denotes the $d$-dimensional difference operator. This is the same key property of distribution functions which results in (bracketing) entropies which depend on dimension only through a logarithmic term. The difference here is that the density functions $f_G$ need not be bounded, and even if the true density $f_0$ is in this class and satisfies $f_0(0) < \infty$, then we do not yet know the behavior of the MLE $\hat f_n$ at zero. In fact we conjecture that: (a) If $f_0(0) < \infty$ and $f_0$ is a scale mixture of uniform densities on rectangles as in (5.1), then $\hat f_n(0) = O_p((\log n)^{\beta})$ for some $\beta = \beta_d > 0$. (b) Under the same hypothesis as in (a) and the hypothesis that $f_0$ has support contained in a compact set, the MLE converges with respect to the Hellinger distance with a rate that is no worse than $n^{-1/3}(\log n)^{\xi}$ where $\xi = \xi_d$. Again Pavlides [2008] and Pavlides and Wellner [2012] establish asymptotic minimax lower bounds for estimation of $f_0(x_0)$ proving that no estimator can have a (local minimax) rate of convergence faster than $n^{-1/3}$ in all dimensions.

This is in sharp contrast to the class of block-decreasing densities on $\mathbb{R}^{+d}$ studied by Pavlides [2012] and by Biau and Devroye [2003]: Pavlides [2012] shows that the local asymptotic minimax rate for estimation of $f_0(x_0)$ is no faster than $n^{-1/(d+2)}$, while Biau and Devroye [2003] show that there exist (histogram type) estimators $\hat f_n$ which satisfy $E_{f_0} \| \hat f_n - f_0 \|_1 = O(n^{-1/(d+2)})$.

6. Appendix 

We begin by summarizing the results of Gao [2012]. For a (probability) measure $\mu$ on $[0, 1]^d$, let $F = F_\mu$ denote the corresponding distribution function given by

\[ F(x) = F_\mu(x) = \mu([0, x]) = \mu([0, x_1] \times \cdots \times [0, x_d]) \]

for all $x = (x_1, \ldots, x_d) \in [0, 1]^d$. Let $\mathcal{F}_d$ denote the collection of all distribution functions on $[0, 1]^d$; i.e.

\[ \mathcal{F}_d = \{ F : \ F \text{ is a distribution function on } [0, 1]^d \}. \]

For example, if $\lambda_d$ denotes Lebesgue measure on $[0, 1]^d$, then the corresponding distribution function is $F(x) = F_{\lambda_d}(x) = \prod_{j=1}^d x_j$.

Theorem 6.1. (Gao, 2012). For $d \ge 2$ and $1 \le p < \infty$

\[ \log N_{[\,]}(\epsilon, \mathcal{F}_d, L_p(\lambda_d)) \lesssim \epsilon^{-1} (\log(1/\epsilon))^{2(d-1)} \]

for all $0 < \epsilon \le 1$.

Our goal here is to use this result to control bracketing numbers for $\mathcal{F}_d$ with respect to two other measures $C_d$ and $R_{d,\sigma}$ defined as follows. Let $C_d$ denote the finite measure on $[0, 1]^d$ with density with respect to $\lambda_d$ given by



For fixed $\sigma > 0$, let $R_{d,\sigma}$ denote the (probability) measure on $(0, 1]^d$ with density with respect to $\lambda_d$ given by



Corollary 6.2. (a) For each $d \ge 2$ it follows that for $\epsilon \le \epsilon_0(d)$

\[ \log N_{[\,]}(2^{d/2} \epsilon, \mathcal{F}_d, L_2(C_d)) \lesssim \epsilon^{-1} (\log(1/\epsilon))^{2(d-1)}. \]

(b) For each $d \ge 2$ and $\sigma \le \sigma_0(d)$ it follows that for $\epsilon \le \epsilon_0(d)/2$

\[ \log N_{[\,]}(2^{d/2+1} \epsilon, \mathcal{F}_d, L_2(R_{d,\sigma})) \lesssim \epsilon^{-1} (\log(1/\epsilon))^{2(d-1)}. \]

Proof. We first prove (a). We set $p = p_d = 2r_d = 2r$ where $r = r_d = 2d - 1$ and $s = (d - 1/2)/(d - 1)$ satisfy $1/r + 1/s = 1$. Let $\{[g_j, h_j], \ j = 1, \ldots, m\}$ be a collection of $\epsilon$-brackets for $\mathcal{F}_d$ with respect to $L_p(\lambda_d)$. (Thus for $d = 2$, $r = 3$, $s = 3/2$, and $p = 6$, while for $d = 4$, $r = 7$, $s = (7/2)/3 = 7/6$, and $p = 14$.) By Theorem 6.1 we can take $\log m \lesssim \epsilon^{-1} (\log(1/\epsilon))^{2(d-1)}$. Now we bound the size of the brackets $[g_j, h_j]$ with respect to $C_d$. Using Hölder's inequality with $1/r + 1/s = 1$ as chosen above we find that




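To prove (b) we introduce monotone transformations $t_j(u_j)$ and their inverses $u_j(t_j)$ which relate $C_d$ and $R_{d,\sigma}$: we set

\[ t_j(u_j) = \sigma \exp\left( u_j^{1/d} \log(1/\sigma) \right) \]

for $j = 1, \ldots, d$. These all depend on $\sigma > 0$, but this dependence is suppressed in the notation.

For the same brackets $[g_j, h_j]$ used in the proof of (a), we define new brackets $[\tilde g_j, \tilde h_j]$ for $j = 1, \ldots, m$ by

\begin{align*}
\tilde g_j(t) &= \tilde g_{j,\sigma}(t) = g_j(u(t)) = g_j(u_1(t_1), \ldots, u_d(t_d)), \\
\tilde h_j(t) &= \tilde h_{j,\sigma}(t) = h_j(u(t)) = h_j(u_1(t_1), \ldots, u_d(t_d)).
\end{align*}

Then it follows easily by direct calculation using

\begin{align*}
\prod_{j=1}^d t_j &= \sigma^d \exp\left( \log(1/\sigma) \sum_{j=1}^d u_j^{1/d} \right), \\
dt &= \prod_{j=1}^d \left\{ \sigma \exp\left( \log(1/\sigma) u_j^{1/d} \right) \cdot d^{-1} u_j^{1/d - 1} \cdot \log(1/\sigma) \, du_j \right\}, \\
\left\{ \prod_{j=1}^d t_j > \sigma \right\} &= \left\{ \exp\left( \log(1/\sigma) \sum_{j=1}^d u_j^{1/d} \right) > \sigma^{-(d-1)} \right\} = \left\{ \log(1/\sigma) \sum_{j=1}^d u_j^{1/d} > (d-1) \log(1/\sigma) \right\}
\end{align*}

that

Thus for $\sigma \le \sigma_0(d)$ we have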

by the arguments in (a). Hence the brackets $[\tilde g_j, \tilde h_j]$ yield a collection of $2^{d/2+1}\epsilon$-brackets for $\mathcal{F}_d$ with respect to $L_2(R_{d,\sigma})$; and this implies that (b) holds. □
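
The conjugate exponents chosen for Hölder's inequality in the proof of part (a), namely $r = 2d - 1$, $s = (d - 1/2)/(d - 1)$ and $p = 2r$, can be verified with exact rational arithmetic. This is a small arithmetic check only; `holder_exponents` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import Tuple


def holder_exponents(d: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Return (r, s, p) with r = 2d - 1, s = (d - 1/2)/(d - 1), p = 2r, for d >= 2."""
    if d < 2:
        raise ValueError("d must be at least 2")
    r = Fraction(2 * d - 1)
    s = Fraction(2 * d - 1, 2 * (d - 1))  # equals (d - 1/2)/(d - 1)
    p = 2 * r
    return r, s, p


table = {d: holder_exponents(d) for d in range(2, 6)}
conjugate = all(1 / r + 1 / s == 1 for r, s, _ in table.values())
```

For $d = 2$ this gives $(r, s, p) = (3, 3/2, 6)$ and for $d = 4$ it gives $(7, 7/6, 14)$; in every dimension $1/r + 1/s = 1$ holds exactly.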